192 research outputs found

Tweets as impact indicators: Examining the implications of automated bot accounts on Twitter

Author: Bowman Timothy D.
Haustein Stefanie
Holmberg Kim
Larivière Vincent
Sugimoto Cassidy R.
Tsou Andrew
Publication date: 15/10/2014

This brief communication presents preliminary findings on automated Twitter accounts distributing links to scientific papers deposited on the preprint repository arXiv. It discusses the implication of the presence of such bots from the perspective of social media metrics (altmetrics), where mentions of scholarly documents on Twitter have been suggested as a means of measuring impact that is both broader and timelier than citations. We present preliminary findings that automated Twitter accounts create a considerable amount of tweets to scientific papers and that they behave differently than common social bots, which has critical implications for the use of raw tweet counts in research evaluation and assessment. We discuss some definitions of Twitter cyborgs and bots in scholarly communication and propose differentiating between different levels of engagement from tweeting only bibliographic information to discussing or commenting on the content of a paper. Comment: 9 pages, 4 figures, 1 table

arXiv.org e-Print Archive

Dépôt Institutionnel Numérique

Foundation Model's Embedded Representations May Detect Distribution Shift

Author: Chiang Tony
Engel Andrew
Tsou Adam
Vargas Max
Publication date: 20/10/2023

Distribution shifts between train and test datasets obscure our ability to understand the generalization capacity of neural network models. This topic is especially relevant given the success of pre-trained foundation models as starting points for transfer learning (TL) models across tasks and contexts. We present a case study for TL on a pre-trained GPT-2 model onto the Sentiment140 dataset for sentiment classification. We show that Sentiment140's test dataset M is not sampled from the same distribution as the training dataset P, and hence training on P and measuring performance on M does not actually account for the model's generalization on sentiment classification. Comment: 14 pages, 8 figures, 5 tables

arXiv.org e-Print Archive

Publish or Practice? An Examination of Librarians' Contributions to Research

Author: Finlay S. Craig
Ni Chaoqun
Sugimoto Cassidy
Tsou Andrew
Publication venue: 'Project Muse'
Publication date: 01/01/2013

This article examines authorship of LIS literature in the context of practitioner and non-practitioner production of published research. For this study, 4,827 peer-reviewed articles from twenty LIS journals published between 1956 and 2011 were examined to determine the percentage of articles written by practitioners. The study identified a decrease in the proportion of articles authored by practitioners between 2006 and 2011. Topic analysis of articles revealed subtle yet distinct differences in research subject matter between practitioner-authored and non-practitioner-authored articles. If present trends continue, the character of LIS literature may shift away from many issues relating to practical librarianship.

Crossref

IUScholarWorks (University of Indiana)

A community of curious souls : an analysis of commenting behavior on TED talks videos

Author: Mongeon Philippe
Sugimoto Cassidy R.
Thelwall Mike
Tsou Andrew
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

The TED (Technology, Entertainment, Design) Talks website hosts video recordings of various experts, celebrities, academics, and others who discuss their topics of expertise. Funded by advertising and members but provided free online, TED Talks have been viewed over a billion times and are a science communication phenomenon. Although the organization has been derided for its populist slant and emphasis on entertainment value, no previous research has assessed audience reactions in order to determine the degree to which presenter characteristics and platform affect the reception of a video. This article addresses this issue via a content analysis of comments left on both the TED website and the YouTube platform (on which TED Talks videos are also posted). It was found that commenters were more likely to discuss the characteristics of a presenter on YouTube, whereas commenters tended to engage with the talk content on the TED website. In addition, people tended to be more emotional when the speaker was a woman (by leaving comments that were either positive or negative). The results can inform future efforts to popularize science amongst the public, as well as to provide insights for those looking to disseminate information via Internet videos.

Directory of Open Access Journals

PubMed Central

Dépôt Institutionnel Numérique

Age stratification and cohort effects in scholarly communication : a study of social sciences

Author: Larivière Vincent
Milojević Staša
Sugimoto Cassidy R.
Sugimoto Thomas
Tsou Andrew
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/07/2016

Aging is considered to be an important factor in a scholar’s propensity to innovate, produce, and collaborate on high quality work. Yet, empirical studies in the area are rare and plagued with several limitations. As a result, we lack clear evidence on the relationship between aging and scholarly communication activities and impact. To this end, we study the complete publication profiles of more than 1000 authors across three fields (sociology, economics, and political science) to understand the relationship between aging, productivity, collaboration, and impact. Furthermore, we analyze multiple operationalizations of aging, to determine which is more closely related to observable changes in scholarly communication behavior. The study demonstrates that scholars remain highly productive across the life-span of the career (i.e., 40 years), and that productivity increases steeply until promotion to associate professor and then remains stable. Collaboration increases with age and has increased over time. Lastly, a scholar’s work obtains its highest impact directly around promotion and then decreases over time. Finally, our results suggest a statistically significant relationship between rank of the scholar and productivity, collaboration, and impact. These results inform our understanding of the scientific workforce and the production of science.

Dépôt Institutionnel Numérique

Efficient kernel surrogates for neural network-based regression

Author: Chiang Tony
Engel Andrew
Qadeer Saad
Stinis Panos
Tsou Adam
Vargas Max
Publication date: 28/10/2023

Despite their immense promise in performing a variety of learning tasks, a theoretical understanding of the effectiveness and limitations of Deep Neural Networks (DNNs) has so far eluded practitioners. This is partly due to the inability to determine the closed forms of the learned functions, making it harder to assess their precise dependence on the training data and to study their generalization properties on unseen datasets. Recent work has shown that randomly initialized DNNs in the infinite width limit converge to kernel machines relying on a Neural Tangent Kernel (NTK) with known closed form. These results suggest, and experimental evidence corroborates, that empirical kernel machines can also act as surrogates for finite width DNNs. The high computational cost of assembling the full NTK, however, makes this approach infeasible in practice, motivating the need for low-cost approximations. In the current work, we study the performance of the Conjugate Kernel (CK), an efficient approximation to the NTK that has been observed to yield fairly similar results. For the regression problem of smooth functions and classification using logistic regression, we show that the CK performance is only marginally worse than that of the NTK and, in certain cases, is shown to be superior. In particular, we establish bounds for the relative test losses, verify them with numerical tests, and identify the regularity of the kernel as the key determinant of performance. In addition to providing a theoretical grounding for using CKs instead of NTKs, our framework provides insights into understanding the robustness of the various approximants and suggests a recipe for improving DNN accuracy inexpensively. We present a demonstration of this on the foundation model GPT-2 by comparing its performance on a classification task using a conventional approach and our prescription. Comment: 29 pages. Software used to reach results available upon request, approved for release by Pacific Northwest National Laboratory

arXiv.org e-Print Archive

Scientists popularizing science : characteristics and impact of TED talk presenters

Author: Larivière Vincent
Macaluso Benoit
Mongeon Philippe
Sugimoto Cassidy R.
Thelwall Mike
Tsou Andrew
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 30/04/2013

The TED (Technology, Entertainment, Design) conference and associated website of recorded conference presentations (TED Talks) is a highly successful disseminator of science-related videos, claiming over a billion online views. Although hundreds of scientists have presented at TED, little information is available regarding the presenters, their academic credentials, and the impact of TED Talks on the general population. This article uses bibliometric and webometric techniques to gather data on the characteristics of TED presenters and videos and analyze the relationship between these characteristics and the subsequent impact of the videos. The results show that the presenters were predominately male and non-academics. Male-authored videos were more popular and more liked when viewed on YouTube. Videos by academic presenters were more commented on than videos by others and were more liked on YouTube, although there was little difference in how frequently they were viewed. The majority of academic presenters were senior faculty, males, from United States-based institutions, were visible online, and were cited more frequently than average for their field. However, giving a TED presentation appeared to have no impact on the number of citations subsequently received by an academic, suggesting that although TED popularizes research, it may not promote the work of scientists within the academic community.

Directory of Open Access Journals

PubMed Central

Dépôt Institutionnel Numérique

How Does TED Talk? A Preliminary Analysis

Author: Demarest Bradford
Sugimoto Cassidy R.
Tsou Andrew
Publication venue: 'iSchools'
Publication date: 15/03/2015

TED Talks is one of the leading science communication initiatives in the digital age. Although previous work has analyzed the demographics of speakers & audience reaction to TED Talks, there is a dearth of research into the actual content of these talks. The transcripts for TED videos were downloaded from the official TED website and analyzed as to word use by different speaker classes (male academics, female academics, male non-academics, and female non-academics). The two subpopulations (males vs. females; academics vs. non-academics) exhibited marked differences in the words that they used during their talks, which may indicate different sentiments, topical preoccupations, and goals for the presentation. Gender was an important variable throughout the study, indicating an issue worthy of further investigation.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Body sway predicts romantic interest in speed dating

Author: Bosnyak Dan J.
Chang Andrew
Kragness Haley E.
Thiede Anja
Trainor Laurel J.
Tsou Wei
Publication date: 01/01/2021

Social bonding is fundamental to human society, and romantic interest involves an important type of bonding. Speed dating research paradigms offer both high external validity and experimental control for studying romantic interest in real-world settings. While previous studies focused on the effect of social and personality factors on romantic interest, the role of non-verbal interaction has been little studied in initial romantic interest, despite being commonly viewed as a crucial factor. The present study investigated whether romantic interest can be predicted by non-verbal dyadic interactive body sway, and enhanced by movement-promoting (‘groovy’) background music. Participants’ body sway trajectories were recorded during speed dating. Directional (predictive) body sway coupling, but not body sway similarity, predicted interest in a long-term relationship above and beyond rated physical attractiveness. In addition, presence of groovy background music promoted interest in meeting a dating partner again. Overall, we demonstrate that romantic interest is reflected by non-verbal body sway in dyads in a real-world dating setting. This novel approach could potentially be applied to investigate non-verbal aspects of social bonding in other dynamic interpersonal interactions such as between infants and parents and in non-verbal populations including those with communication disorders. Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Big Data, Bigger Dilemmas: A Critical Review

Author: Tsou Andrew
Arave G.
Bowman Timothy
Ekbia Hamid
Ghazinejad Ali
Kouper Inna
Mattioli Michael
Sugimoto Cassidy R.
Suri Venkata R.
Weingart Scott
Publication venue: Digital Repository @ Maurer Law
Publication date: 01/08/2015

The recent interest in Big Data has generated a broad range of new academic, corporate, and policy practices along with an evolving debate among its proponents, detractors, and skeptics. While the practices draw on a common set of tools, techniques, and technologies, most contributions to the debate come either from a particular disciplinary perspective or with a focus on a domain-specific issue. A close examination of these contributions reveals a set of common problematics that arise in various guises and in different places. It also demonstrates the need for a critical synthesis of the conceptual and practical dilemmas surrounding Big Data. The purpose of this article is to provide such a synthesis by drawing on relevant writings in the sciences, humanities, policy, and trade literature. In bringing these diverse literatures together, we aim to shed light on the common underlying issues that concern and affect all of these areas. By contextualizing the phenomenon of Big Data within larger socioeconomic developments, we also seek to provide a broader understanding of its drivers, barriers, and challenges. This approach allows us to identify attributes of Big Data that require more attention (autonomy, opacity, generativity, disparity, and futurity), leading to questions and ideas for moving beyond dilemmas.

arXiv.org e-Print Archive

bepress Legal Repository

Indiana University Bloomington Maurer School of Law